Abstract


This paper concerns the problem of searching, with $p$ parallel
processors, for a given key in a random ordered table of size $n$. We
propose a parallel interpolation algorithm which we show has expected
time cost $\le \log(1 + \log(n)/\log(p)) + O(1)$, and we prove that this
algorithm has optimal expected time cost within a constant additive term.
1. INTRODUCTION


1.1 The Parallel Search Problem for a Random Ordered Table

We assume a random ordered table

$$(L, X_1, \ldots, X_n, H)$$

where $L = X_0$ and $H = X_{n+1}$ are given reals, $L < H$, and
$X_1 < \cdots < X_n$ are constructed by sorting $n$ distinct random reals
chosen from $\{x \mid L < x < H\}$. The table has size $n$. Given a search
key $y$, $L < y < H$, we wish to determine the index $k^*$ such that
$X_{k^*} \le y < X_{k^*+1}$.

We assume a parallel machine model with $p \ge 1$ synchronized
processors which in a single step may simultaneously read $p$ distinct
table keys $X_{k_1}, \ldots, X_{k_p}$ at indices $k_1, \ldots, k_p$
determined by the algorithm. The algorithm must then utilize these values
to determine the indices of the table keys to be read in the next step.
(Note that, as in the previous literature on search algorithms, for
example [Yao and Yao, 76], [Perl and Itai, 78], and [Gonnet, Rogers and
George, 80], we do not take into account the cost to calculate these
indices $k_1, \ldots, k_p$.)

The worst case time for a parallel search algorithm is the maximum
number of steps required for any key $y$ searched in any ordered
table of size $n$. The expected time is the maximum, over all search
keys $y$, of the average number of steps for searching $y$ in a random
ordered table of size $n$.
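The setting above can be made concrete with a small sketch. It assumes a uniform distribution on $(L, H)$ and stores the table as a Python list `[L, X_1, ..., X_n, H]`; the function names `make_random_ordered_table` and `true_index` are illustrative and do not come from the paper.

```python
import bisect
import random


def make_random_ordered_table(n, low=0.0, high=1.0, seed=None):
    """Return [L, X_1, ..., X_n, H] with X_1 < ... < X_n drawn from (L, H)."""
    rng = random.Random(seed)
    keys = set()
    while len(keys) < n:          # a set keeps the n reals distinct
        x = rng.uniform(low, high)
        if low < x < high:
            keys.add(x)
    return [low] + sorted(keys) + [high]


def true_index(table, y):
    """Reference answer: the index k* with table[k*] <= y < table[k*+1]."""
    return bisect.bisect_right(table, y) - 1


table = make_random_ordered_table(20, seed=1)
k_star = true_index(table, 0.5)
print(k_star, table[k_star], table[k_star + 1])
```

The reference index computed here is only a correctness oracle for experiments; it says nothing about the parallel step count that the paper measures.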

1.2 Binary Search

[Knuth, 73] shows that sequential binary search has time bound
$\lfloor \log(n) \rfloor$ in the worst case and $\log(n)$ in the average,
and that this is optimal for the sequential comparison tree model. [Gal
and Miranker, 67] and [Snir, 82] have shown that a parallel binary search
has worst case time $\le \log(n)/\log(p)$, and we can also show this to be
the average time for parallel binary search. [Snir, 82] has also shown
that any parallel search algorithm must have worst case time
$\ge \log(n)/\log(p)$, so parallel binary search gives optimal worst case
time. However, we shall see that parallel binary search is far from
optimal with respect to expected time.
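A minimal sketch of a parallel binary search follows. It assumes the classical scheme in which each step probes up to $p$ evenly spaced keys, cutting the current interval into $p + 1$ parts; the cited papers may differ in details, and the function name is illustrative. The inner `for` loop stands in for $p$ simultaneous reads.

```python
def parallel_binary_search(table, y, p):
    """Return (k*, steps) with table[k*] <= y < table[k*+1].

    table = [L, X_1, ..., X_n, H]; every step reads at most p keys.
    """
    lo, hi = 0, len(table) - 1          # invariant: table[lo] <= y < table[hi]
    steps = 0
    while hi - lo > 1:
        m = hi - lo - 1                 # unread keys strictly between lo and hi
        q = min(p, m)
        # q distinct positions splitting (lo, hi) into q + 1 nearly equal parts
        probes = [lo + (j * (hi - lo)) // (q + 1) for j in range(1, q + 1)]
        steps += 1
        new_lo, new_hi = lo, hi
        for k in probes:                # conceptually one parallel read
            if table[k] <= y:
                new_lo = k              # largest probed key not exceeding y
            elif new_hi == hi:
                new_hi = k              # smallest probed key exceeding y
        lo, hi = new_lo, new_hi
    return lo, steps
```

Under this scheme the interval length shrinks by a factor of about $p + 1$ per step, so the step count depends on $n$ only through $\log(n)/\log(p+1)$, regardless of where $y$ falls.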

1.3 Interpolation Search

[Peterson, 57] is the first published account of the use of repeated
interpolations to choose indices at expected locations of the search key
$y$. The expected time complexity of sequential interpolation search was
posed as an open problem in [Knuth, 73]; subsequently it was independently
shown to be $\log\log(n) + O(1)$ by [Yao and Yao, 76], [Perl and Itai, 78]
and [Gonnet, Rogers, and George, 80]. [Yao and Yao, 76] proved that
interpolation search has optimal expected time for any sequential search
algorithm, within a constant additive term. They also found that a
constant number of processors does not speed up interpolation search by
more than a constant additive term. They pose as an open problem the
expected time complexity of searching with $p$ processors, when $p$ is a
function of $n$.
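For reference, sequential interpolation search can be sketched as below. This is a plain textbook-style variant written for the table layout `[L, X_1, ..., X_n, H]`, not necessarily the exact procedure analysed in the cited papers; the function name is illustrative.

```python
def interpolation_search(table, y):
    """Sequential interpolation search; returns (k*, probes)."""
    lo, hi = 0, len(table) - 1          # invariant: table[lo] <= y < table[hi]
    probes = 0
    while hi - lo > 1:
        # fraction of the current key range lying below y
        a = (y - table[lo]) / (table[hi] - table[lo])
        # expected position of y among the hi - lo - 1 unread keys
        k = lo + 1 + int(a * (hi - lo - 1))
        k = min(max(k, lo + 1), hi - 1)  # guard against rounding at the ends
        probes += 1
        if table[k] <= y:
            lo = k
        else:
            hi = k
    return lo, probes
```

Each probe is placed where $y$ is expected to sit if the remaining keys are uniformly spread, which is what drives the doubly logarithmic expected cost on random tables; on adversarial tables the same loop can take up to $n$ probes.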

1.4 Results of this Paper

In this paper we propose two algorithms for parallel search:

(a) Parallel Pseudo Interpolation Search is a generalization of the
sequential pseudo interpolation search of [Perl and Reingold, 77]. We
show in Section 2 that this parallel search algorithm has expected time
$\le C(\varepsilon)\log(1 + \log(n)/(4\log(\varepsilon p))) + O(1)$, where
$\varepsilon > 0$ is a parameter of the algorithm which may be set
arbitrarily and $C(\varepsilon) \ge 1$ is a constant that approaches 1
exponentially fast as $\varepsilon \to 0$. For example, $C(1) < 2.03$.

(b) Parallel Interpolation Search is the natural generalization of
the sequential interpolation search. We show in Section 3 that this
parallel search algorithm has expected time
$\le \log(1 + \log(n)/\log(p)) + O(1)$.

In Section 4 we show that any parallel search algorithm requires
time $\ge \log(1 + \log(n)/\log(p)) - c$, where $c \ge 0$ is a constant.
Thus our parallel interpolation search algorithm has optimal expected time.

Section 5 concludes this paper.
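To give a feeling for the gap between the two bounds quoted in this introduction, the following sketch evaluates $\log(n)/\log(p)$ against $\log(1 + \log(n)/\log(p))$ for a few sample values. It assumes the outer logarithm is base 2 and $p \ge 2$ (the ratio $\log(n)/\log(p)$ is base independent); additive constants are ignored, and the function names are illustrative.

```python
import math


def parallel_binary_bound(n, p):
    """Worst-case bound log(n)/log(p) quoted for parallel binary search."""
    return math.log(n) / math.log(p)


def parallel_interpolation_bound(n, p):
    """Leading term log2(1 + log(n)/log(p)) of the expected-time bound."""
    return math.log2(1 + math.log(n) / math.log(p))


for n, p in [(10**6, 2), (10**6, 16), (10**12, 16), (10**12, 1024)]:
    print(n, p,
          round(parallel_binary_bound(n, p), 2),
          round(parallel_interpolation_bound(n, p), 2))
```

For $n = 2^{64}$ and $p = 2$, for instance, the first quantity is 64 while the second is $\log_2 65$, a little above 6.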
2. PARALLEL PSEUDO INTERPOLATION SEARCH ($PIS_{\varepsilon,p}$)


This section describes a parallel algorithm $PIS_{\varepsilon,p}$ for
searching by using repeated phases of a single interpolation followed by a
series of parallel probes at equally spaced intervals, shifted linearly
from the interpolation. Our algorithm is a generalization of the sequential
pseudo interpolation search algorithm of [Perl and Reingold, 77]; here
we utilize $p$ processors and introduce a parameter $\varepsilon > 0$ which
may be fixed to improve the performance of the algorithm ([Perl and
Reingold, 77] only describe the case where $p = 1$ and $\varepsilon = 1$).


2.1 Algorithm $PIS_{\varepsilon,p}$

Input: random ordered table $(L = X_0, X_1, \ldots, X_n, H = X_{n+1})$ and
search key $y$, $L < y < H$.

Output: index $k^*$ where $X_{k^*} \le y < X_{k^*+1}$.

Note that $k^*$ is a binomial random variable whose distribution
function has parameters $a$, $n$, where $a = (y - L)/(H - L)$. Thus
$\mu = an$ is the expected value of $k^*$. Our immediate goal will be to
determine $k^*$ within bounds of distance
$\delta = \sqrt{n}/(\varepsilon p)$. These bounds will be further
reduced by recursive calls to the algorithm. Define indices
$k_i = \lceil \mu + i\delta \rceil$ for all integers $i$,
$\ell_0 \le i \le h_0$, where $\ell_0 = \lceil 1 - \mu/\delta \rceil$ and
$h_0 = n - \lfloor \mu/\delta \rfloor$.

We will repeatedly execute the following loop:

Initially let $i_0 \leftarrow 0$. In parallel read values $X_{k_i}$ for
each $i$, $\ell \le i \le h$, where
$\ell = \max(\ell_0, i_0 - \lfloor p/2 \rfloor + 1)$ and
$h = \min(h_0, i_0 + \lceil p/2 \rceil)$. If $y < X_{k_\ell}$ then repeat
the above step with $i_0 \leftarrow i_0 - p$. If $y > X_{k_h}$ then
repeat the above step with $i_0 \leftarrow i_0 + p$. Otherwise there exist
$i$, $i+1$ where $\ell \le i$, $i+1 \le h$ and
$X_{k_i} \le y \le X_{k_{i+1}}$.
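One phase of the loop just described can be sketched as follows. This is an illustration under stated assumptions, not the paper's exact procedure: probe indices are clamped to $[0, n+1]$ instead of using the limits $\ell_0$, $h_0$; the spacing is taken as $\max(1, \sqrt{n}/(\varepsilon p))$; and the bracket found so far is carried across successive windows so that a key falling between two windows is still detected. The function name `pis_phase` is hypothetical.

```python
import math


def pis_phase(table, y, p, eps=1.0):
    """One phase on table = [L, X_1, ..., X_n, H].

    Returns (k_lo, k_hi, steps) with table[k_lo] <= y < table[k_hi].
    """
    n = len(table) - 2
    a = (y - table[0]) / (table[-1] - table[0])
    mu = a * n                                   # expected value of k*
    delta = max(1.0, math.sqrt(n) / (eps * p))   # probe spacing (assumed >= 1)

    def k(i):
        # probe index ceil(mu + i*delta), clamped to [0, n + 1] (assumption)
        return min(max(math.ceil(mu + i * delta), 0), n + 1)

    i0, i_lo, i_hi, steps = 0, None, None, 0
    while i_lo is None or i_hi is None:
        # window of p consecutive values of i around i0
        window = range(i0 - p // 2 + 1, i0 + (p + 1) // 2 + 1)
        steps += 1                               # one parallel read of p keys
        for i in window:
            if table[k(i)] <= y:
                i_lo = i if i_lo is None else max(i_lo, i)
            else:
                i_hi = i if i_hi is None else min(i_hi, i)
        # move toward the side on which y is not yet bracketed
        i0 += p if i_hi is None else -p
    return k(i_lo), k(i_hi), steps
```

Because the probed values of $i$ always form one contiguous range and the keys are sorted, the bracket returned is between two adjacent probe indices, so its width is at most about $\delta$; a recursive call on that subtable would then continue the search.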


If $\delta = 1$ then halt with output
Since the loop is always executed at least once, $S \ge 1$. We have already
noted that $k^*$ has a binomial distribution with parameters $a$, $n$ and
thus variance $\sigma^2 = a(1-a)n$. We can approximate the distribution of
$k^*$ by a normal distribution (see [Feller, 68]), giving for $s \ge 2$

Prob{s  >s}  <1 - — 

G  /2v 
where

$$u = (s-1)\,\delta/\sigma = (s-1)/\bigl(\varepsilon p \sqrt{a(1-a)}\bigr) \ge 2(s-1)/(\varepsilon p).$$

□

Let $T_{\varepsilon,p}(n)$ be the expected number of steps executed by
algorithm $PIS_{\varepsilon,p}$ for a table of size $n$.

THEOREM 1. $T_{\varepsilon,p}(n) \le C(\varepsilon)\log(1 + \log(n)/(4\log(\varepsilon p))) + O(1)$ for $\varepsilon p > 2$.

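Proof. Clearly $T_{\varepsilon,p}(p) = 1$. Since each recursive call
reduces a table of size $n$ to a subtable of size
$\delta = \sqrt{n}/(\varepsilon p)$ we have

$$\begin{aligned}
T_{\varepsilon,p}(n) &\le T_{\varepsilon,p}(\lceil \sqrt{n}/(\varepsilon p) \rceil) + C(\varepsilon) \\
&\le T_{\varepsilon,p}(\lceil (n(\varepsilon p)^2)^{1/2^t} (\varepsilon p)^{-2} \rceil) + t\,C(\varepsilon) \\
&= T_{\varepsilon,p}(p) + t\,C(\varepsilon)
\end{aligned}$$

for

$$t \le \log(1 + \log(n)/(4\log(\varepsilon p))) + O(1).$$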


□ 
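The recursion used in this proof, in which a table of size $n$ shrinks to one of size $\lceil \sqrt{n}/(\varepsilon p) \rceil$, can be iterated numerically. The sketch below only counts how many such reductions are needed before the size drops to at most $p$; it assumes $\varepsilon p \ge 2$ so that the size strictly decreases, and the function name is illustrative.

```python
import math


def pis_recursion_depth(n, p, eps=1.0):
    """Number of reductions n -> ceil(sqrt(n)/(eps*p)) until n <= p."""
    if eps * p < 2:
        raise ValueError("this sketch assumes eps * p >= 2")
    depth = 0
    while n > p:
        n = math.ceil(math.sqrt(n) / (eps * p))
        depth += 1
    return depth


for n in (10**3, 10**6, 10**12, 10**24):
    print(n, pis_recursion_depth(n, p=4))
```

Squaring $n$ adds roughly one more reduction, which is the doubly logarithmic behaviour that the bound on $t$ expresses.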
3. PARALLEL INTERPOLATION SEARCH ($IS_p$)

This section presents a parallel search algorithm which seems a
natural generalization of the sequential interpolation search of [Peterson,
57] to $p$ parallel processors.

3.1 Algorithm $IS_p$

Input: random ordered table $(L = X_0, X_1, \ldots, X_n, H = X_{n+1})$ and
search key $y$, $L < y < H$.

Output: index $k^*$ where $X_{k^*} \le y < X_{k^*+1}$.

Initially assign $\ell \leftarrow 0$ and assign $h \leftarrow n+1$.

Repeat forever the following loop:

We can assume that we have already read the values of $X_\ell$ and $X_h$
and must search for $y$ in the random ordered table
$(X_\ell, X_{\ell+1}, \ldots, X_{h-1}, X_h)$.
Assign $n' \leftarrow h - \ell - 1$. If $p \ge n'$ then all of
$X_{\ell+1}, \ldots, X_{h-1}$ can be read in a single step so we can halt
and output $k^*$. Otherwise, assign
$a \leftarrow (y - X_\ell)/(X_h - X_\ell)$. Note that $k^* - \ell$ has a
binomial distribution with parameters $a$ and $n'$. Let $IB$ be the
functional inverse of this binomial distribution function (i.e.,
$z = \mathrm{Prob}\{k^* - \ell \le IB(z)\}$ for $0 < z < 1$).
Assign the indices $k_i \leftarrow \lceil \ell + IB(i/(p+1)) \rceil$ for
$i = 1, \ldots, p$. Also let $k_0 \leftarrow \ell$ and
$k_{p+1} \leftarrow h$. In a single parallel step read $X_{k_i}$ for
$i = 1, \ldots, p$. Assign $d$ to be the maximum integer such that
$X_{k_d} \le y \le X_{k_i}$ for all $i$, $d < i \le p+1$. Finally reassign
$\ell \leftarrow k_d$, $h \leftarrow k_{d+1}$, and repeat the above loop.
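The loop of the parallel interpolation search can be sketched in a few lines. This is a minimal illustration under several assumptions: the inverse binomial distribution function is computed by direct summation of the probability mass function (adequate for small tables only), probe indices are clamped to the unread range and de-duplicated (so a step may read fewer than $p$ keys), and the names `binomial_quantiles` and `parallel_interpolation_search` are hypothetical.

```python
import math


def binomial_quantiles(n, a, zs):
    """For ascending zs, smallest m with Prob{Binomial(n, a) <= m} >= z."""
    out, cdf, m = [], 0.0, -1
    for z in zs:
        while cdf < z and m < n:
            m += 1
            log_pmf = (math.lgamma(n + 1) - math.lgamma(m + 1)
                       - math.lgamma(n - m + 1)
                       + m * math.log(a) + (n - m) * math.log1p(-a))
            cdf += math.exp(log_pmf)
        out.append(m)
    return out


def parallel_interpolation_search(table, y, p):
    """Return (k*, steps) with table[k*] <= y < table[k*+1]."""
    lo, hi, steps = 0, len(table) - 1, 0   # table[lo] <= y < table[hi]
    while True:
        n1 = hi - lo - 1                   # unread keys between lo and hi
        if n1 <= 0:
            return lo, steps
        steps += 1
        if p >= n1:
            probes = list(range(lo + 1, hi))   # read everything that is left
        else:
            a = (y - table[lo]) / (table[hi] - table[lo])
            a = min(max(a, 1e-12), 1 - 1e-12)  # keep the logarithms finite
            zs = [i / (p + 1) for i in range(1, p + 1)]
            qs = binomial_quantiles(n1, a, zs)
            probes = sorted({min(max(lo + q, lo + 1), hi - 1) for q in qs})
        for k in probes:                   # conceptually one parallel read
            if table[k] <= y:
                lo = k
            else:
                hi = k
                break
```

Placing the probes at the $i/(p+1)$ quantiles of the distribution of $k^* - \ell$ makes each of the $p + 1$ resulting subintervals about equally likely to contain $k^*$, which is the property the analysis in this section relies on.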
For each $t = 1, 2, \ldots$ let $\ell(t)$, $h(t)$ be the values of $\ell$,
$h$ respectively on entering the $t$-th iteration of the loop of $IS_p$,
and let $n'(t)$, $a(t)$, $IB$, $k_i(t)$, $d(t)$ be the values of the
corresponding variables as defined in the $t$-th iteration of the loop.
For notational simplicity let $k(t) = k_{d(t)}(t)$ and let $k(0) = 0$. Let
$\Delta(t) = |k(t) - k(t-1)|$ and $\Delta_i(t) = |k_i(t) - k(t-1)|$ for
$i = 1, \ldots, p$. By definition we have for $i = 1, \ldots, p$

PROPOSITION 1. $\Delta_i(t)$


The history sequence previous to the $t$-th iteration is

$$\sigma(t) = \bigl((X_{\ell(1)}, X_{h(1)}, \ell(1), h(1)), \ldots, (X_{\ell(t)}, X_{h(t)}, \ell(t), h(t))\bigr).$$

For each $i = 1, \ldots, p$ let $\sigma_i(t)$ be the event $\sigma(t)$ and
$d(t) \in \{i-1, i\}$. Again by definition it follows that for
$i = 1, \ldots, p$

PROPOSITION 2. $k_i(t) = E(k^* \mid \sigma_i(t))$.

Since the conditional events $(\sigma_i(t) \mid \sigma(t))$ are equally
likely,

PROPOSITION 3.

$$\Delta(t) = \frac{1}{p} \sum_{i=1}^{p} \Delta_i(t).$$
Proof. Define $\Gamma(t)$ to be a continuous function which is a linear
interpolation of $2^t \log(\Delta(t)) - t\log(p)$ at distinct points
$t = 0, 1, \ldots$ where $\Delta(0) = n$. Then $\Gamma(t)$ is a
supermartingale with respect to the history sequence $\sigma(t)$ since by
Lemma 2, $E(\Gamma(t+1) \mid \sigma(t)) \le \Gamma(t)$. Let the stopping
time $T$ be the minimum $t \ge 0$ such that
$\Gamma(t) \le (2^t - t)\log(p)$, so $\Delta(\lceil T \rceil) \le p$. Then
by the Optional Stopping Theorem (see [Karlin and Taylor, 75]),

$$E(\Gamma(T)) \le E(\Gamma(0)) = \log(n).$$

Thus

$$T_p(n) = E(T) \le \log(1 + \log(n)/\log(p)) + O(1).$$
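The last step of this proof is stated without detail. One possible way to fill it in is sketched here, writing $\Gamma$ for the interpolated process and $T$ for the stopping time, and assuming $n > p \ge 2$, logarithms to base 2, and that $\Gamma(T) = (2^T - T)\log p$ holds at the stopping time by continuity of $\Gamma$. The argument uses Jensen's inequality and the elementary bound $x \le 2^{x-1} + 1$ for $x \ge 0$; it is an illustration, not necessarily the authors' own derivation.

```latex
\begin{align*}
\bigl(2^{E(T)} - E(T)\bigr)\log p
  &\le E\bigl((2^{T} - T)\log p\bigr)
     && \text{(Jensen; } t \mapsto 2^{t} - t \text{ is convex)} \\
  &= E(\Gamma(T)) \;\le\; \log n, \\
2^{E(T)-1} - 1
  &\le 2^{E(T)} - E(T) \;\le\; \frac{\log n}{\log p}
     && \text{(since } x \le 2^{x-1} + 1 \text{ for } x \ge 0\text{)}, \\
E(T) &\le 1 + \log\Bigl(1 + \frac{\log n}{\log p}\Bigr).
\end{align*}
```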


4. PARALLEL INTERPOLATION SEARCH IS OPTIMAL


This section shows that the parallel interpolation search algorithm
described in Section 3 is optimal within a constant additive term. We
actually prove a slightly stronger theorem from which our optimality
result follows as a corollary.

Given a random ordered table $(L, X_1, \ldots, X_n, H)$ and a search key
$y$, $L < y < H$, let the expected distance of this search problem be
$\lambda = \lceil an \rceil$ where $a = (y-L)/(H-L)$. For integers
$p \ge 1$ and $\lambda \ge 1$ let $R_p(\lambda)$ be the minimum expected
time cost for any parallel algorithm with $p$ processors to solve all
search problems with expected distance $\lambda$. The following theorem is
a direct generalization of a lower bound proof due to [Yao and Yao, 76]
for sequential search of random ordered tables.

THEOREM 3. $R_p(\lambda) \ge f_p(\lambda) - c + 1/f_p(\lambda)$, where
$c > 0$ is a constant and
$f_p(\lambda) = \log(1 + \log(\lambda)/\log(p))$.

Proof by induction. Suppose the theorem holds for all $\lambda' < \lambda$.
Let $(L, X_1, \ldots, X_n, H)$ be a random ordered table with search key
$y$ and expected distance $\lambda$. Fix a parallel search algorithm $A_p$
with $p$ processors. Let $k_1, \ldots, k_p$ be the choice by $A_p$ of key
indices to be read on the first step. Then from these indices we can show
there exists a set $J \subset [L, H]$ such that if HIT is the event
$X_{k_i} \in J$ for some $i$,

(1) $\mathrm{Prob}(\mathrm{HIT}) \le \lambda^{-2\zeta}$, and

(2) not HIT implies that on the second step of $A_p$ the resulting
search problem has expected distance $\ge \lambda'$, where

$$\lambda' = \frac{\lambda^{1/2 - \zeta}}{p}$$

and

$$\zeta = (\ln(\lambda)/(2\ln(p)))^{-1/2}.$$

Thus

$$\begin{aligned}
R_p(\lambda) &\ge 1 + \mathrm{Prob}(\text{not HIT}) \cdot R_p(\lambda') \\
&\ge 1 + (1 - \lambda^{-2\zeta})\,(f_p(\lambda') - c + 1/f_p(\lambda')) \\
&\ge f_p(\lambda) - c + 1/f_p(\lambda). \qquad \Box
\end{aligned}$$
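The quantities in this induction step can be evaluated numerically. The sketch below assumes the formulas as they can be read from the damaged text, namely $\zeta = (\ln\lambda/(2\ln p))^{-1/2}$, $\lambda' = \lambda^{1/2-\zeta}/p$ and $f_p(\lambda) = \log_2(1 + \log\lambda/\log p)$; the symbol name `zeta`, the base-2 logarithm, and the function names are assumptions made for illustration. It is meaningful only when $\lambda$ is large enough that $\zeta < 1/2$ and $\lambda' > 1$.

```python
import math


def f(p, lam):
    """f_p(lambda) = log2(1 + log(lambda)/log(p))."""
    return math.log2(1 + math.log(lam) / math.log(p))


def first_step_reduction(p, lam):
    """Return (zeta, lambda') for one step of the lower-bound argument."""
    zeta = (math.log(lam) / (2 * math.log(p))) ** -0.5
    lam_next = lam ** (0.5 - zeta) / p
    return zeta, lam_next


p, lam = 2, 2.0 ** 64
zeta, lam_next = first_step_reduction(p, lam)
print(zeta, math.log2(lam_next), f(p, lam), f(p, lam_next))
```

In this example the expected distance falls from $2^{64}$ to roughly $2^{19.7}$ when the first step misses $J$, and $f_p$ drops by somewhat more than 1; the $1/f_p$ terms in the theorem are what absorb that excess in the induction.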


For any table size $n$, $R_p(\lambda)$ is maximum when the search key is
chosen so $\lambda = \lceil n/2 \rceil$. By Theorem 3, we can always find
a constant $c \ge 0$ such that the following holds.


COROLLARY. Any parallel search algorithm $A_p$ with $p$ processors
has expected time

$$\ge \log(1 + \log(n)/\log(p)) - c.$$
5. CONCLUSION

We have determined, within an additive constant, the average case
complexity of parallel searching of an ordered random table with keys
independently chosen from a uniform distribution. Our results can be
extended to nonuniform distributions which have invertible cumulative
distribution functions.

Of the two algorithms we describe in this paper, the parallel
pseudo interpolation algorithm of Section 2 may be more practical,
since its interpolations require only calculation of the mean of a
binomial, whereas the parallel interpolation algorithm of Section 3
requires calculating the inverse of the cumulative distribution
function of a binomial.